A Data Mining Approach for Detecting Collusion in Unproctored Online Exams
J. Langerbein(1), T. Massing(1), J. Klenke(1), N. Reckmann(1), M. Striewe(1),
M. Goedicke(1), C. Hanck(1)
(1) University of Duisburg-Essen, Germany
Setting
- Data from the Descriptive Statistics course at the University of Duisburg-Essen, Germany
- Exams consist of arithmetical problems, programming tasks in R, and a short essay task
- Both exams are conducted digitally with the e-assessment system JACK
- Each student receives different randomized numerical values across all tasks
- Event logs capture students’ activities, time stamps, and points during the exams for every subtask
- The test group took the unproctored exam at home during the COVID-19 pandemic
- The comparison group took a proctored exam in the facilities of the university
Table 1: Overview of the test and comparison groups
| | Comparison | Test |
|---|---|---|
| Year | 18/19 | 20/21 |
| N | 109 | 151 |
| Style | proctored | unproctored |
| Total points | 60 | 60 |
| Subtasks | 19 | 17 |
| Duration | 70 | 70 |
- Data cleaning removes students with minimal participation or achievement and students with internet problems
Methodology
- The study uses an agglomerative (bottom-up) hierarchical clustering algorithm based on the following pairwise dissimilarity measure:
\[D(s_i, s_{i'}, v_i, v_{i'}) = \frac{1}{h} \sum_{j=1}^h (w_j^P \cdot d_j^P (s_{ij}, s_{i'j}) + w_j^L \cdot d_j^L (v_{ij}, v_{i'j}))\]
- \(D(s_i, s_{i'}, v_i, v_{i'})\): the global pairwise dissimilarity between students \(i\) and \(i'\)
- \(d_j^P(s_{ij}, s_{i'j})\): points dissimilarity for each task \(j\)
- \(d_j^L(v_{ij}, v_{i'j})\): students' event pattern dissimilarity for each task \(j\)
- The weights \(w_j^P\) and \(w_j^L\), with \(\displaystyle \sum_{j=1}^h (w_j^P + w_j^L) = 1\), control the influence of each attribute on the global object dissimilarity
- We reduce the weights for:
- R tasks, as these tasks have more noise
- Essay questions, as comparisons on this kind of task are limited
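The weighted combination above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `global_dissimilarity`, the three-task example, and the specific weight values are hypothetical and not taken from the study.

```python
import numpy as np

def global_dissimilarity(d_points, d_logs, w_points, w_logs):
    """Combine per-task dissimilarities into D for one pair of students.

    d_points[j], d_logs[j]: d_j^P and d_j^L for task j (length h).
    w_points[j], w_logs[j]: weights w_j^P and w_j^L (length h).
    """
    d_points = np.asarray(d_points, dtype=float)
    d_logs = np.asarray(d_logs, dtype=float)
    w_points = np.asarray(w_points, dtype=float)
    w_logs = np.asarray(w_logs, dtype=float)
    # Constraint from the text: all weights together sum to 1.
    if not np.isclose(w_points.sum() + w_logs.sum(), 1.0):
        raise ValueError("weights must sum to 1")
    h = d_points.size  # number of tasks
    return float(np.sum(w_points * d_points + w_logs * d_logs) / h)

# Hypothetical example with h = 3 tasks. The third task gets half the
# relative weight, mimicking the down-weighting of R and essay tasks.
raw = np.array([1.0, 1.0, 0.5])      # assumed relative task importance
w_points = 0.5 * raw / raw.sum()     # assumed: half of the total weight on points
w_logs = 0.5 * raw / raw.sum()       # assumed: half on event patterns
D = global_dissimilarity([0, 1, 2], [4, 0, 6], w_points, w_logs)
print(D)
```

Splitting the total weight evenly between points and event patterns is a choice made for the example; any split that keeps the overall sum at 1 fits the formula.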
- Points achieved
- Dissimilarity in points achieved for each task \(j\):
\[d_j^P(s_{ij}, s_{i'j}) = | s_{ij} - s_{i'j} |\]
- Event patterns
- Dissimilarity in students' event patterns for each task \(j\):
\[d_j^L(v_{ij}, v_{i'j}) = \sum_{m=1}^{K=70} | v_{ijm} - v_{i'jm} |\]
- The examination is divided into \(K = 70\) time intervals, indexed by \(m = 1, ... , 70\)
- \(v_{ijm}\) denotes the number of answers of student \(i\) for task \(j\) in the \(m\)-th interval
- \(d_j^L\) is the Manhattan metric between the two students' interval counts
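The two per-task dissimilarities can be illustrated as follows. This is a minimal sketch, not the study's implementation: the function names, the timestamp format (time since exam start, in the same unit as the exam duration of 70), and the example values are assumptions.

```python
import numpy as np

K = 70  # number of time intervals, as in the text

def points_dissimilarity(s_i, s_k):
    """d_j^P for every task j: absolute difference in points."""
    return np.abs(np.asarray(s_i, dtype=float) - np.asarray(s_k, dtype=float))

def interval_counts(answer_times, duration=70.0, k=K):
    """Count the answers of one student for one task in each interval.

    answer_times: hypothetical timestamps measured from the exam start,
    in the same unit as `duration` (an assumption about the log format).
    """
    counts, _ = np.histogram(answer_times, bins=k, range=(0.0, duration))
    return counts

def log_dissimilarity(v_i, v_k):
    """d_j^L for one task: Manhattan distance between interval counts."""
    return int(np.abs(np.asarray(v_i) - np.asarray(v_k)).sum())

# Made-up answer times of two students for a single task.
v_a = interval_counts([3.2, 3.9, 41.3])
v_b = interval_counts([3.5, 40.2, 41.7])
d_log = log_dissimilarity(v_a, v_b)

# Made-up points of the same two students on three tasks.
d_pts = points_dissimilarity([4, 0, 2.5], [4, 1, 2.5])
```

Two students who answer the same task in the same intervals obtain a small \(d_j^L\), even if their randomized numerical values differ.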
Empirical Results
- Figure 1 shows the dendrogram of the test group
- Overall, the level of dissimilarity is lower than in the comparison group
- Six clusters (A-F) stand out noticeably from the rest of the cohort
- Figure 2 illustrates the individual comparison of achieved points and event logs of the student cluster with the highest similarity
- Similar time path
- Same points for each task
- Figure 3 compares the normalized distributions of the dissimilarity measures between the comparison and test groups
- Clusters A, B, and E stand out
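A dendrogram like the one described above results from agglomerative clustering of the pairwise dissimilarities. The sketch below uses SciPy and assumes average linkage and a hypothetical cut height, since the text names neither. The dissimilarity matrix for five students is made up for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Hypothetical symmetric matrix of global dissimilarities for 5 students.
D = np.array([
    [0.0, 0.1, 0.8, 0.9, 0.7],
    [0.1, 0.0, 0.9, 0.8, 0.8],
    [0.8, 0.9, 0.0, 0.7, 0.9],
    [0.9, 0.8, 0.7, 0.0, 0.6],
    [0.7, 0.8, 0.9, 0.6, 0.0],
])

# SciPy's linkage expects the condensed (upper-triangle) form.
condensed = squareform(D)

# Bottom-up merging; "average" linkage is an assumption of this sketch.
Z = linkage(condensed, method="average")

# Cut the tree at an assumed height of 0.3 to obtain flat cluster labels.
labels = fcluster(Z, t=0.3, criterion="distance")

# Keep only clusters with more than one student as candidates for review.
candidates = [
    np.flatnonzero(labels == c).tolist()
    for c in np.unique(labels)
    if np.sum(labels == c) > 1
]
```

Here only students 0 and 1 merge below the cut height, so they form the single candidate pair. Passing `Z` to `scipy.cluster.hierarchy.dendrogram` draws the corresponding tree (plotting requires matplotlib).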
Discussion
- Three notable clusters (A, B, and E) consisting of two students each
- Collusion in larger groups is not found
- These results also hold for other linkage methods and parameter specifications, such as the weightings
- The approach provides a basis for the examination of clusters based on comparison with a reference group
- The elevated risk of detection may indeed discourage students from cheating in unproctored exams
Further Research
- Long-term efficacy of the collusion detection method during exams and its impact on academic integrity and student behavior
- Refining methods for gathering and analyzing supplementary evidence